12 research outputs found

    Sharp estimation in sup norm with random design

    The aim of this paper is to recover the regression function under sup norm loss. We construct an asymptotically sharp estimator which converges with the spatially dependent rate r_{n,\mu}(x) = P \big( \log n / (n \mu(x)) \big)^{s/(2s+1)}, where \mu is the design density, s the regression smoothness, n the sample size, and P a constant expressed in terms of a solution to a problem of optimal recovery as in Donoho (1994). We prove this result under the assumption that \mu is positive and continuous. The estimator combines kernel and local polynomial methods, with a kernel given by optimal recovery, which allows us to prove the result up to the constants for any s > 0. Moreover, the estimator does not depend on \mu. We prove that r_{n,\mu}(x) is optimal in a sense stronger than the classical minimax lower bound. Finally, an inhomogeneous confidence band is proposed; its non-constant length depends on the local amount of data.
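
    A quick specialization, added here only as an illustration and not part of the abstract: when the design density is uniform (\mu identically equal to 1), the spatially dependent rate collapses to the classical sup norm rate for s-Hölder regression.

        % Illustration under an assumed uniform design \mu \equiv 1.
        \[
          r_{n,\mu}(x) = P \Big( \frac{\log n}{n\,\mu(x)} \Big)^{s/(2s+1)}
          \quad\longrightarrow\quad
          P \Big( \frac{\log n}{n} \Big)^{s/(2s+1)},
        \]
        % the usual sup norm rate over an s-Hölder ball; a design putting
        % little mass near x inflates the rate through the factor 1/\mu(x).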

    Convergence rates for pointwise curve estimation with a degenerate design

    We consider the nonparametric regression model with random design. We want to recover the regression function at a point x where the design density vanishes or explodes. Depending on assumptions on the local regularity of the regression function and on the local behaviour of the design, we find several minimax rates. These rates lie in a wide range, from slow l(n) rates, where l(.) is slowly varying (for instance (log n)^(-1)), to fast n^(-1/2) * l(n) rates. If the continuity modulus of the regression function at x can be bounded from above by an s-regularly varying function, and if the design density is b-regularly varying, we prove that the minimax convergence rate at x is n^(-s/(1+2s+b)) * l(n).
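
    A consistency check on the displayed rate (an illustration, not part of the abstract): taking b = 0, i.e. a design density that neither vanishes nor explodes at x, recovers the classical pointwise minimax rate, while b > 0 (vanishing design) slows the rate down and b < 0 (exploding design) pushes it towards the fast n^(-1/2) * l(n) rates mentioned above.

        % Illustration under the assumption b = 0 (design density bounded away
        % from 0 and infinity near x), with l(n) slowly varying.
        \[
          n^{-s/(1+2s+b)} \, l(n) \;\Big|_{b=0} \;=\; n^{-s/(2s+1)} \, l(n),
        \]
        % the standard pointwise rate for s-smooth regression; as b decreases
        % towards -1, the exponent s/(1+2s+b) increases towards 1/2.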

    Régression non-paramétrique et information spatialement inhomogène (Nonparametric regression and spatially inhomogeneous information)

    We study the nonparametric estimation of a signal based on inhomogeneous noisy data (the amount of data varies over the estimation domain). We consider the model of nonparametric regression with random design. Our aim is to understand the consequences of inhomogeneous data on the estimation problem in the minimax setup. Our approach is twofold: local and global. In the local setup, we want to recover the regression function at a point with little, or much, data. By translating this property into several assumptions on the design density, we obtain a large range of new minimax rates, containing very slow and very fast rates. Then, we construct a smoothness-adaptive procedure, and we show that it converges with the minimax rate penalised by a minimal cost for local adaptation. In the global setup, we want to recover the regression function with sup norm loss. We propose estimators converging with spatially dependent rates, which reflect the inhomogeneous behaviour of the information in the model. We prove the spatial optimality of these rates, which consists in a strengthening of the classical minimax lower bound for sup norm loss. In particular, we construct an asymptotically sharp estimator over Hölder balls of any smoothness, and a confidence band whose width adapts to the local amount of data.

    High dimensional matrix estimation with unknown variance of the noise

    We propose a new pivotal method for estimating high-dimensional matrices. Assume that we observe a small set of entries, or linear combinations of entries, of an unknown matrix A_0 corrupted by noise. We propose a new method for estimating A_0 which does not rely on the knowledge or on an estimation of the standard deviation of the noise σ. Our estimator achieves, up to a logarithmic factor, optimal rates of convergence under the Frobenius risk and thus has the same prediction performance as previously proposed estimators that rely on the knowledge of σ. Our method is based on the solution of a convex optimization problem, which makes it computationally attractive.

    Inégalités d'oracle exactes pour la prédiction d'une matrice en grande dimension (Sharp oracle inequalities for high-dimensional matrix prediction)

    We consider the problem of prediction of a noisy high-dimensional matrix of size m × T, meaning that m T is much larger than the sample size n. We focus on the trace norm minimization algorithm, but also consider other penalizations. It is now well known that such algorithms can be used for matrix completion, as well as for other problems such as multi-task learning, see \cite{candes-plan2, candes-recht08, candes-plan1, candes-tao1, rohde-tsyb09, MR2417263}. In this work, we propose sharp oracle inequalities in a statistical learning setup.
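
    The trace (nuclear) norm penalty mentioned above is most often handled by proximal methods whose core step is singular value soft-thresholding. The sketch below is a minimal generic illustration of that idea in Python/NumPy on a matrix completion toy problem; it is not the estimator, the tuning, or the oracle inequalities studied in the paper, and all data and parameters (rank, noise level, lam, step) are made up for the example.

        import numpy as np

        def svd_soft_threshold(M, lam):
            """Proximal operator of lam * trace norm: soft-threshold the singular values."""
            U, s, Vt = np.linalg.svd(M, full_matrices=False)
            return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

        # Toy matrix completion: minimize 0.5 * ||P_Omega(A - Y)||_F^2 + lam * ||A||_*
        # by proximal gradient descent (generic setup with hypothetical data).
        rng = np.random.default_rng(0)
        m, T, rank = 20, 30, 2
        A0 = rng.normal(size=(m, rank)) @ rng.normal(size=(rank, T))  # low-rank truth
        mask = rng.random((m, T)) < 0.3                               # observed entries
        Y = np.where(mask, A0 + 0.1 * rng.normal(size=(m, T)), 0.0)   # noisy observations

        A, step, lam = np.zeros((m, T)), 1.0, 0.5
        for _ in range(200):
            grad = mask * (A - Y)                        # gradient of the data-fit term
            A = svd_soft_threshold(A - step * grad, step * lam)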

    Self-exclusion in online poker gamblers: effect on time and money as compared to matched controls

    No comparative data are available on the effect of online self-exclusion. The aim of this study was to assess the effect of self-exclusion in online poker gambling, as compared to matched controls, after the end of the self-exclusion period. Methods: We included all gamblers who were first-time self-excluders over a 7-year period (n = 4887) on a poker website, and gamblers matched for gender, age and account duration (n = 4451). We report the effects of self-exclusion over time, after it ended, on money (net losses) and time spent (session duration), using an analysis-of-variance comparison between mixed models with and without the interaction of time and self-exclusion. Analyses were performed on the whole sample, on the sub-groups that were the most heavily involved in terms of time or money (higher quartiles), and among short-duration self-excluders (<3 months). Results: Significant effects of self-exclusion and of short-duration self-exclusion were found for money and time spent over 12 months. Among the gamblers who were the most heavily involved financially, no significant effect on the amount spent was found. Among the gamblers who were the most heavily involved in terms of time, a significant effect was found on time spent. Short-duration self-exclusions showed no significant effect on the most heavily involved gamblers. Conclusions: Self-exclusion seems effective in the long term. However, the effect on money spent of self-exclusions and of short-duration self-exclusions should be further explored among the most heavily involved gamblers.
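
    For readers who want to see the shape of the model comparison described above (mixed models with and without the time-by-self-exclusion interaction), here is a minimal sketch using Python's statsmodels. It is only an approximation of the analysis: the column names (net_loss, month, self_excluded, gambler_id) and the file name are hypothetical placeholders, a likelihood-ratio test stands in for the paper's analysis-of-variance comparison, and the original study may have used different software and model specifications.

        import pandas as pd
        import statsmodels.formula.api as smf
        from scipy import stats

        # Hypothetical long-format data: one row per gambler per month after the
        # self-exclusion period ended (column names are placeholders, not the study's).
        df = pd.read_csv("poker_followup.csv")  # gambler_id, month, self_excluded, net_loss

        # Random-intercept mixed models, with and without the time x self-exclusion
        # interaction, fitted by maximum likelihood so the fits are comparable.
        full = smf.mixedlm("net_loss ~ month * self_excluded", df,
                           groups=df["gambler_id"]).fit(reml=False)
        reduced = smf.mixedlm("net_loss ~ month + self_excluded", df,
                              groups=df["gambler_id"]).fit(reml=False)

        # Likelihood-ratio test for the interaction term (one extra fixed effect,
        # assuming self_excluded is binary and month enters linearly).
        lr = 2 * (full.llf - reduced.llf)
        print("LR =", round(lr, 2), " p =", round(stats.chi2.sf(lr, df=1), 4))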